Clustering user interface

ABSTRACT

A user is presented with a display of item labels and category labels, where item labels are shown to the extent that categories contain not more than a threshold count of items. Alternatively, the item labels are shown to the extent that display area is left over after the display of category labels, and the categories for which item labels are shown are selected from the smallest categories to the largest.

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application is a continuation-in-part of U.S. patent application No. 08/711,300, filed Sep. 6, 1996. That patent application is incorporated by reference herein for all purposes.

BACKGROUND OF THE INVENTION

[0002] This invention relates to computer database searches. More precisely, this invention provides an improved interface for performing computer database searches and filtering search results.

[0003] In the typical database search, a user queries the database by selecting a set of criteria (a request) and submitting those criteria to a database engine. The criteria might be in the form of a range of allowable values for a given field, an upper limit, a lower limit, or an exact match. Multiple field criteria can be combined by the use of logical operators (e.g., NOT, AND, OR, GREATER_THAN). The criteria might also include comparisons between multiple fields (e.g., AGE>=IQ). Once the criteria are submitted to the database engine, the database engine selects all the records in the database that meet all of the criteria selected by the user and returns those records to the user.

[0004] Many different methods have been utilized to facilitate the user's creation of database requests. One standard for specifying database search criteria to the database engine is Structured Query Language (SQL). SQL statements are strings of text and numbers which define the search request. If the end user requesting data from a database is proficient in SQL, the end user can specify the SQL statement directly. However, where the end user is not proficient in SQL programming, a user interface might be provided to allow the end user to intuitively select elements from the user interface, which are then converted into SQL statements for submission to an SQL-capable database engine. Many user interfaces which convert user input into SQL statements are known. In some interfaces, a user answers a series of questions, fills out an on-screen form or selects from a finite number of choices using a mouse or other pointing device.

[0005] Once such a search request is submitted, the database engine returns the records meeting the criteria, if any, and the user interface displays the records or some indication of the records. Several methods have been used to display the data returned from the database query, such as tables, charts, graphs, and graphic images such as maps or tree structures. The term “maps” as used herein can be geographical maps or logical maps laying out data points in an N-dimensional coordinate system.

[0006] It is also known to allow for refinement of a database search after the results of an initial search have been obtained. A refinement search is often desired by users where the initial search produces too many or too few records. With a refinement search, users might edit the initial search request or add additional criteria to the initial search request. By doing so, users can find the proper quantity and quality of records they need.

[0007] An important factor in determining proper search criteria is prior knowledge of the database and the distribution of the values in each field. For example, if the database contains only a listing of Democrats, a search for male Republicans over the age of fifty living in cities of at least one million people will turn up no records. A user who does not know this might then waste time refining the search by first eliminating any limitation on the size of the city, then eliminating the age limitation, eventually to discover that there are no Republicans in the database. The opposite result, too many records, could occur if the user specified too general a set of search criteria. In that case, the user might also waste time iteratively narrowing the search with little effect. Often, the success of a database search is dependent upon luck.

[0008] Another shortcoming of most current search methodologies is that users do not gain any knowledge during the search process to help refine their search. Some systems, especially those for geographical databases, present records to the user as a set of dots overlaid on a map. While such an interface might be useful for a few dots, it becomes impractical for use with a large number of dots.

[0009] With large numbers of dots, either the dots will be too small to be selected with a pointing device or, if large enough, the dots will obscure other dots. FIG. 1 illustrates the former problem; FIG. 2 illustrates the latter.

[0010] One way to avoid the selection problem is to allow the user to click the pointing device near the point of interest and treat the click as the selection of all points within a radius (“a radius search”). Of course, this has the disadvantage of selecting too many records, too few records or the wrong records.

[0011] Another way to avoid the selection problem is to have the user select by arbitrary boundaries, such as state boundaries, and get a text listing of all the locations in that state, and possibly a further graphical selection of a county or region. The state selection is shown in FIG. 3. The disadvantage of this approach is that the user, at most, knows only whether or not a state contains at least one dot.

[0012] In addition to geographical data and other data which can be plotted in a two-dimensional plane (or an N-dimensional coordinate system with N being an integer greater than zero), data in searchable hierarchical structures often need to be searched.

[0013] One way to present the data points in a hierarchical structure search is to present all the data points. However, with large numbers of data points, this is impractical.

[0014] For example, Microsoft Word™ word processing software provides a feature for searching for files among the hierarchical directory structure of a disk. The matching files are displayed in a directory tree structure showing each of the directories for the files along with their parent directories. While this is useful for a small number of files, the display for a large number of files would not fit on the screen and thus the user must scroll through the listing or manually “collapse” uninteresting directory structures to be able to see the directory structures of interest. This approach does not give the user a “big picture” view of file structures, unless the user knows what the big picture looks like and creates it for themselves.

[0015] Yet another shortcoming of prior art information displays appears when there are more items to be listed than can fit on a display. One solution is to list all the items on a scrollable list. The other is to group the items into categories and display categories first, allow the user to select one of the categories, then display the items matching the selected category. Neither of these approaches is entirely satisfactory, however. In the former case, too much data and not enough information is presented. In the latter case, depending on the categories, the display might be underused.

[0016] Therefore, what is needed is an improved search interface which presents the user with information and views of the overall data being searched, in order to allow for an informed search.

SUMMARY OF THE INVENTION

[0017] In one embodiment of a search interface according to the present invention, a user is presented with a display map from which the user selects database records of interest. For the records which are in range of the display map's limits, a cluster evaluator groups some of the records into clusters. Generally, a cluster is a set of records which would be close together on the display map. On the display map, unclustered records are represented by item icons, while clusters are represented by cluster icons. If a user selects an item icon, the associated record is selected and a predetermined action is taken. If a cluster icon is selected, the display map is “zoomed in” to show greater detail around the cluster. Typically, a number of clustered records become unclustered records when the zooming occurs, because the clustering criteria are sharpened. The user can iteratively select clusters, resulting in greater and greater detail, until the user selects an item icon and the predetermined action is taken with that record. The predetermined action could be any action which might be taken with an individual record, such as displaying additional fields for the record, initiating a process for manipulating the record, or the like.

[0018] The term “icon,” as used herein, refers to a user interface element which is presented to the user and is selectable by the user using a pointing device. Depending on context, icons can be regular shapes or irregular shapes, can have identified borders or implied borders, and can either be illustrated with a graphic that hints at the meaning of the icon or can be a plain area whose meaning is identified by context.

[0019] The records are mapped to the display based on field values of the record. For example, a two-dimensional map could be used to display records which contain longitude and latitude values. The same principles could be used for one-dimensional data, three-dimensional data, or higher dimensional data which can be suitably represented on a two- or three-dimensional display. If the search data is numerical data from as many fields as the display has dimensions, cluster icons are graphically displayed as shaded regions on a graph or map, roughly coextensive with the mappings of the records to the display. Unclustered records are graphically represented by item icons at the location corresponding to the field data of the records. Thus, a user can easily see where the clusters are, where the unclustered records are, and regions where no records exist. Clusters might overlap, so that one record is clustered into more than one cluster. By selecting a particular cluster, the user reduces the dataset to a subset which contains only those records which were clustered into the selected cluster. The new subset is then clustered to allow the users to narrow their search further, and the selection process to begin again. The users can end their search at any time, but eventually, the dataset will be reduced to a set of records that do not cluster.

[0020] The present invention can also be used to present hierarchical structures for user selection of records from such structures. Examples of such hierarchical structures include electronic file storage directory structures, organizational charts within large organizations, plain hypertext pages on a Web site, or an intranet. In such cases, clustering is done by using a metric which is the number of links rather than a distance in a coordinate system, or a weighted metric which takes into account hierarchies where some links are more important than others, such as a taxonomy hierarchy. For example, two files in the same directory would be deemed to have a metric between them of 0, and a file in a parent directory and a file in an immediate subdirectory would have a metric between them of 1. For files which are “cousins,” the metric could either be the number of links in the shortest path between the two files (2) or the number of links to a common ancestor (1). Such a clustering process might prove to be highly useful in sorting through corporate data, if it is clustered and organized according to an organizational chart, wherein information provided by departments nearer to the user's department is given more prominence than information from departments which are “farther” away on the organization chart. Presentation of hierarchical information would also be greatly simplified using the present invention, as many search engines or searching World Wide Web sites tend to either show too many hits or too few hits. For example, a large URL (uniform resource locator) database such as YAHOO!'s database (http://www.yahoo.com/) would provide too many references matching the word “team” to be of use to a searcher. However, with the clustering user interface of the present invention and a starting point, such as the “Oakland Raiders” subcategory of Yahoo!'s hierarchical subject database, the user could be presented with all the hits, where many of the hits are clustered into a single entry. The user would be provided with a cluster icon to “open” that cluster, possibly leading to other clusters. Instead of a URL database, the invention could also be applied to a “yellow pages” database.

[0021] In order to achieve a manageable number of clusters, the clustering algorithm can incrementally increase the minimum distance separating two clusters. As the distance increases, the number of clusters will decrease. When the total number of clusters becomes less than a predetermined threshold, that distance is used in the clustering algorithm. An alternative method for incremental clustering would be to initially create clusters of records that are separated by very small distances and then gradually combine these clusters into bigger clusters until the total number of clusters becomes less than the threshold. In order to prevent creating too few clusters, the clustering process can be modified to terminate before all cluster pair distances are calculated. For instance, if a current step of clustering is combining all clusters less than five miles apart, the clustering process could terminate when the goal of getting the total number of clusters to fewer than twenty has been achieved, even though there are still clusters less than five miles apart remaining.
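The first method in the preceding paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the specification's implementation: the function names, the fixed step size, the use of Euclidean distance on plain coordinate tuples, and the choice to count single-record groups as clusters are all hypothetical.

```python
from itertools import combinations
from math import dist


def cluster(points, threshold):
    """Group points so that any two points closer than `threshold` share a group."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) < threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, point in enumerate(points):
        groups.setdefault(find(i), []).append(point)
    return list(groups.values())


def cluster_with_dynamic_threshold(points, max_clusters, start=1.0, step=1.0):
    """Raise the distance threshold until fewer than `max_clusters` groups remain.

    `start` and `step` are assumed tuning values, not taken from the source.
    """
    threshold = start
    clusters = cluster(points, threshold)
    # Stop once the count drops below the limit, or when everything has merged.
    while len(clusters) >= max_clusters and len(clusters) > 1:
        threshold += step
        clusters = cluster(points, threshold)
    return threshold, clusters


points = [(0, 0), (1, 0), (10, 0), (11, 0), (30, 0)]
threshold, clusters = cluster_with_dynamic_threshold(points, max_clusters=3)
```

With the sample points, the threshold climbs until the two nearby pairs merge into one group, leaving that group and the isolated point.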

[0022] In yet another application of clustering, a list of categorized items is displayed as a list of labels, where a label is a textual and/or graphical indication of what the label represents. Each label represents either an item or a category, where an item label represents an individual item, while a category label represents an entire category of items. A display manager determines which items are shown individually and which are grouped by category so that the full display is used. One method of filling a display is to set a threshold count and to tag for individual display all items in categories having no more than the threshold count of items and to tag for category display the items in categories having more than the threshold count of items. The threshold count is then adjusted so that the labels just fill the display. Thus, if there are too many labels, the threshold count is lowered, and if there are too few labels, the threshold count is raised, until the threshold count reaches the size of the largest category, at which point the display will show all of the items with item labels.
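A minimal sketch of the threshold-count method described above, assuming the categories are given as a mapping from category name to a list of item names and the display holds a fixed number of labels. The function name `choose_labels` and the `capacity` parameter are hypothetical. The sketch follows the stated rule that categories with no more than the threshold count of items are expanded into item labels, so a larger threshold can only add labels.

```python
def choose_labels(categories, capacity):
    """Return the labels to display, using the largest threshold count that fits.

    categories: dict mapping a category name to its list of item names.
    capacity:   assumed maximum number of labels the display can hold.
    """

    def labels_for(threshold):
        labels = []
        for name, items in categories.items():
            if len(items) <= threshold:
                labels.extend(items)   # small category: one label per item
            else:
                labels.append(name)    # large category: a single category label
        return labels

    largest = max((len(items) for items in categories.values()), default=0)
    best = labels_for(0)               # threshold 0: category labels only
    for threshold in range(1, largest + 1):
        candidate = labels_for(threshold)
        if len(candidate) > capacity:
            break                      # the label count never shrinks as the threshold grows
        best = candidate
    return best


categories = {
    "fruit": ["apple", "pear"],
    "tools": ["saw", "awl", "adz", "vise"],
    "misc": ["key"],
}
labels = choose_labels(categories, capacity=5)
```

In this example the two smaller categories are expanded into item labels and the largest one stays collapsed, because expanding it would exceed the five-label capacity.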

[0023] A further understanding of the nature and advantages of the inventions herein may be realized by reference to the remaining portions of the specification and the attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0024] FIG. 1 is a clickable map with small squares on the map representing individual records in a database of records where each record has a location within the map of the continental United States.

[0025] FIG. 2 is a clickable map with large squares as used in the prior art.

[0026] FIG. 3 is a clickable map where a state of interest is selected.

[0027] FIG. 4 is a front view of a computer system which could be used to implement the present invention.

[0028] FIG. 5 is a block diagram of the components of the computer system shown in FIG. 4 as they relate to the present invention.

[0029] FIG. 6 is a clickable map according to the present invention in which unclustered item icons and cluster icons are overlaid on a map.

[0030] FIG. 7 is a clickable map which results from user selection in FIG. 6 of the cluster under the mouse cursor.

[0031] FIG. 8 is a flowchart of a display process according to the present invention.

[0032] FIG. 9 is a block diagram of a networked system over which the present invention could be used.

[0033] FIG. 10 is a map of a directory structure which can be searched using the present invention.

[0034] FIG. 11 is a block diagram of a display in which items are represented by item labels and category labels.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0035] The present invention can operate on a wide variety of computers with a number of suitable configurations, each being the preferred embodiment for a particular implementation. For example, a palmtop implementation might be preferred for use with a restaurant finding device designed to be used by a person walking around looking for a good restaurant meeting the person's criteria. A desktop model, with high resolution graphics and a direct Internet connection, might be preferred for an office-based application. This detailed description of the invention will describe an embodiment using a desktop computer and an embodiment using the Internet (the global internetwork of networks generally referred to by that name), although it should be understood that many other configurations of computing devices might also be used to provide similar functionality. For example, the user might be an Internet or Intranet surfer, a system administrator, an email administrator or an interactive television viewer/user. The computing device might also be a mainframe terminal or computer kiosk.

[0036] FIG. 4 shows a typical desktop computer 10 made up of a system box 12 enclosing a hard disk 14, a keyboard 16, a mouse 18, and a monitor 20 including a display surface 22. As is well known in the art, desktop computer 10 can be programmed with programs stored on hard disk 14 to perform various tasks. These tasks might involve accepting user input from keyboard 16 and mouse 18, and/or displaying images or text on display surface 22. The input from mouse 18 is in the form of either movement in two dimensions (rolling the mouse) or clicking (pressing one of the mouse buttons). It is also well known to replace mouse 18 with equivalent two-dimensional pointing devices, or even three-dimensional pointing devices. With suitable software or hardware, monitor 20 can be made to display three-dimensional images, as is well known in the graphic arts.

[0037] FIG. 5 is a block diagram illustrating the typical contents of system box 12 in more detail. A central processing unit (CPU) 202 executes programs stored in a program memory 204 by reading program instructions from program memory 204 via processor bus 206. Of course, multiple CPUs could be provided for faster operation in some computer systems. CPU 202 interprets input signals from mouse 18 provided on bus 206 after being suitably transformed by a mouse driver 208 which might be hardware, software, or a combination of both. CPU 202 uses a RAM (random access memory) 210 for storage of program variables as needed. Bus 206 also provides a path for CPU 202 to send output signals to display surface 22 of monitor 20 (see FIG. 4) via a display driver 222 and video display memory 224. Where networking is used, desktop computer 10 can communicate over a network via a network connection 218.

[0038] To program desktop computer 10 according to the present invention, hard disk 14 is loaded with a map database 220 and one or more programs. To begin executing a program which implements the present invention, CPU 202 causes the program to be loaded from hard disk 14 into program memory 204. In a typical personal desktop computer, program memory 204 and RAM 210 are different segments of one memory structure. As shown in FIG. 5, the program includes a cluster evaluator 230, a display generator 232, an input processor 234, a database query interface 236 and a record action module 238. Other components, such as a database engine, are included but not shown in FIG. 5.

[0039] Using the system described in FIGS. 4-5, a user will select records from map database 220 and have a selected record acted on by record action module 238. A process of using the system according to the present invention is described in more detail in connection with FIGS. 6-7. First, two examples of the prior art will be described. Both the prior art and the presently described inventive embodiment are user interfaces in which the user's goal is to select a record according to geographical location. One use of such a system is to allow users to select one store of a national chain of stores and perform an action with respect to that store (e.g., dial that store's telephone number, get a map of surface streets giving directions to the store, get listings of hours of operation, products carried, manager name, etc.).

[0040] FIG. 1 shows one method of record selection known in the prior art. Each of the small black squares on the map represents one store. To select a store, the user positions a mouse cursor 250 over the desired one of the black squares and clicks a mouse button to select the store. While that may be useful in areas where there are few stores, such as Colorado and Oklahoma, it would be very difficult for the user to position the mouse exactly to select the desired store if it were in California, Washington, Texas or the East Coast. The selection difficulty increases if the display uses, instead of small black squares, icons displaying the company's logo.

[0041] FIG. 2 shows what a user interface might look like if icons displaying a company's logo were to be presented in place of each dot. As FIG. 2 shows, the logo icons would be completely unwieldy even with only a few locations.

[0042] FIG. 3 shows one way to avoid the problems illustrated in FIGS. 1-2. Instead of the user having to select one store from all the national stores, the user instead selects the state of interest. In FIG. 3, the user has positioned the cursor and selected the state of Wyoming. In response, the computer would show an expanded, zoom display of the state of Wyoming. As could have been seen from FIG. 1 or FIG. 2, the selection of Wyoming is fruitless because there are no stores in Wyoming. On the other hand, the user might have selected California, which does have many stores. However, the computer might need to zoom in one more level if too many stores are represented in the state view.

[0043] FIG. 6 shows an improved display 300 according to the present invention. Overlaid on a map of the contiguous United States, display 300 includes item icons 302 and cluster icons 304. Item icons 302 represent the locations of individual stores which are not clustered with other stores as determined by cluster evaluator 230. Using mouse cursor 310, the user can select either a cluster icon 304 or an item icon 302. If the user selects an item icon 302, the computer responds with the predetermined item action. If the user selects a cluster icon, such as the one shown under mouse cursor 310, the computer zooms in to the map shown in FIG. 7. As should be apparent from FIG. 6, the user can easily identify where there are no stores and where there are collections of stores. Note that the data and the cluster icons are not necessarily constrained by state or other boundaries. The shapes of the cluster icons are determined by the locations of their constituent items, as explained below.

[0044] Referring now to FIG. 7, the zoomed-in map 320 now shows several individual stores in a second layer of clusters. The process of zooming continues until the user aborts or selects an item icon. Note that, in FIGS. 6-7, the individual icons can be large enough to show the company's logo (“K” for the imaginary sunglass retailer “Kool Shades” used for this example) without interfering with the selection process.

[0045] The display maps shown in FIGS. 6-7 are generated by computer system 10 according to the process shown in the flowchart of FIG. 8. That flowchart includes a series of steps numbered in ascending order which are executed in that order except where indicated. In the first step (S1), the computer determines the map bounds. For a national chain such as Kool Shades, the default initial map bounds are defined by the rectangle enclosing the contiguous United States. In step S2, the computer determines which records are within the map bounds. These are the records the user can select among.

[0046] For the selectable records, cluster membership is determined (S3). Each cluster has a set of records which are its members. In the illustrations provided here, each record is either a member of no cluster or a member of one cluster. However, the system could be designed to allow a record to be in more than one cluster, so long as each cluster contains a subset of records significantly smaller in size than the set of all records. Many of the different clustering algorithms in different fields of math and science will function adequately for this purpose, depending on the goals of the search and type of data. Other clustering factors, such as the metric used and the normalization of metrics, play important roles in the search process.

[0047] One method for clustering uses a geographic distance metric. Each record for a Kool Shades store includes the longitude and latitude of the store. Using this information, the geographic distance between two stores is easily calculated using known techniques.
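The specification does not name a particular technique for the geographic distance. As one illustration, the haversine great-circle formula can be used. The choice of formula, the function name, and the mean Earth radius of 3958.8 miles are assumptions made for this sketch.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # approximate mean radius; an assumed constant


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against rounding just above the valid asin range.
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


# Two points on the equator, a quarter of the way around the globe apart.
quarter_circle = haversine_miles(0.0, 0.0, 0.0, 90.0)
```

A function of this kind could serve as the distance metric between any two store records that carry latitude and longitude fields.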

[0048] With other types of data, a metric can be generated from the values for the data. For example, if the user were selecting records not based on a geographic location, but based on the records' “locations” on a plot of two field values, those field values might need to be normalized. One way to normalize the values is to scale each axis of the plot so that the values fall within a square. This is equivalent, for a first field and a second field, to dividing each first field value by the range of first field values and dividing each second field value by the range of second field values.

[0049] Although not required for distance calculations, the values can be transformed so that all normalized values fall between zero and one. Thus, a new value, V′, for a field value can be calculated from the original field value, V, with the formula:

V′=K(V−L)/(H−L)

[0050] where L is the lowest field value and H is the highest field value. The multiplicative constant K is set to one to keep all values in the range from zero to one, but it could be set to any other suitable value.
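The formula above can be applied to a whole field at once, as in the following minimal sketch. The function name is hypothetical, and the handling of a field whose values are all equal (H equal to L) is an assumption, since the specification does not address that case.

```python
def normalize(values, k=1.0):
    """Apply V' = K(V - L)/(H - L) to every value of one field."""
    low, high = min(values), max(values)
    if high == low:
        # Assumed fallback: a constant field carries no spread, so map it to zero.
        return [0.0 for _ in values]
    return [k * (v - low) / (high - low) for v in values]


ages = [10, 20, 30]
scaled = normalize(ages)
```

Normalizing each plotted field in this way puts both axes on the same zero-to-one scale before distances between records are computed.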

[0051] In any case, not only can a position in the two-dimensional display space be found for any record, but the “distance” between any two records can also be found. These distances are used to determine which records cluster with which other records. One method for clustering data is to consider each record as an N-dimensional sphere (a circle in the case of Kool Shades stores). Any records whose spheres intersect would be clustered together. The radius of the spheres may depend on the needs of the application, the total number of records, the distance between map boundaries and the density of records in particular locations.

[0052] In the case of exclusive clustering (each record is a member of at most one cluster), the following pseudocode describes the process of assigning records to clusters:

[0053] Main Program: Find All Clusters

[0054] Begin Loop (for each Record)

[0055] If Record is Unclustered then Begin

[0056] Create New Cluster

[0057] Add_Record (Record, New_Cluster)

[0058] End If Then

[0059] End Loop

[0060] End Program

[0061] Procedure Add_Record (Record, Cluster)

[0062] Put Record in Cluster

[0063] Begin Loop (for each Unclustered Record)

[0064] If (Distance (Record, Unclustered Record)<Threshold) then Add_Record (Unclustered Record, Cluster)

[0065] End Loop

[0066] End Procedure

[0067] The result of this process is that each unclustered record will be assigned to one cluster and any record which is within a sphere radius of a clustered record will be a member of that cluster. As should be apparent, the size of the cluster is often dependent on the distribution of its members. In the above program implementation, all records are assigned to clusters, with the unclustered records each being the sole member of their cluster. Although, technically, they may be assigned to such solo clusters, they are considered to be unclustered.
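The pseudocode above can be rendered in Python roughly as follows. This is a sketch under assumptions: records are plain coordinate tuples, the default metric is Euclidean distance, and the recursive Add_Record procedure is replaced by an explicit work list so that long chains of neighboring records do not exhaust the recursion limit. The function and variable names are hypothetical.

```python
from math import dist


def find_all_clusters(records, threshold, distance=dist):
    """Assign every record to exactly one cluster, returned as lists of record indices."""
    unclustered = set(range(len(records)))
    clusters = []
    for seed in range(len(records)):
        if seed not in unclustered:
            continue                    # already absorbed by an earlier cluster
        unclustered.remove(seed)
        members = [seed]                # "Create New Cluster" seeded with this record
        frontier = [seed]               # records whose neighbors are still to be checked
        while frontier:
            current = frontier.pop()
            near = [i for i in sorted(unclustered)
                    if distance(records[current], records[i]) < threshold]
            for i in near:
                unclustered.remove(i)   # "Put Record in Cluster"
                members.append(i)
                frontier.append(i)
        clusters.append(members)
    return clusters


records = [(0, 0), (1, 0), (2, 0), (10, 0)]
clusters = find_all_clusters(records, threshold=1.5)
```

In the example, the first and third records are farther apart than the threshold but still share a cluster because each is near the second record. The fourth record ends up alone in its own cluster, which the text treats as unclustered.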

[0068] Once the clusters and unclustered records are determined, the cluster icon shapes are calculated (S4). Of course, if the data is not graphically displayable, this step need not be done. A cluster icon can just be a shaded area covering the union of the circles for each record in the cluster. An algorithmically easier method for determining shape is to consider each record as a square or a cube. The shape of the cluster is then derived by using polygons to outline the shape of the union of the squares or cubes.

[0069] In step S5, the map is overlaid with cluster icons and item icons. In step S6, the map and overlays are displayed to the user. As explained above, the user can easily see where unclustered records are, where clusters are, the clusters' extents and the areas where no records are to be found.

[0070] In step S7, user input is accepted and in step S8, the type of user input is evaluated. If the user clicks on the map outside an item icon or a cluster icon, the program flow is from step S8 back to S7 (i.e., no action is performed). If the user selects an item icon, the predetermined action for items is performed (S9), be it displaying a street map with directions to the individual store or other data about that individual record. If the user selects a cluster icon, the program proceeds to step S10, where a new, zoomed map is generated. Once the new map is generated, the program loops back to step S2 to generate a new set of included records and the program repeats steps S2 through S8. Where applicable, the predetermined action could be a display of the record in detail showing all field values, presenting an editing screen for updating the record, initiating a phone call or an e-mail message to a place or person referred to in the record, or anything else that could be done with just one record.

[0071] A concrete example is presented here for further explanation of the process. Suppose data fields to be clustered were latitude and longitude and the cluster icons and item icons were to be displayed on a 400×400 pixel map of the contiguous United States. A radius of 8 pixels could then be used to cluster 1000 records scattered over a map of the United States. Since the scale on the US map is roughly 10 miles/pixel, records within 80 miles of each other would be clustered (i.e., the initial cluster threshold would be 80 miles). If the user selects a cluster and the zoomed-in map has a resolution of roughly 2 miles/pixel, records within 16 miles of each other will then be clustered. The result of the larger scale is that more locations are separated and more sub-clusters are created. Of course, with dynamic thresholding, the threshold distance is adjusted to obtain a suitable number of clusters.
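The arithmetic in this example amounts to multiplying the fixed on-screen radius by the current map scale. A minimal sketch, with a hypothetical function name:

```python
def cluster_threshold_miles(radius_pixels, miles_per_pixel):
    """Convert a fixed on-screen clustering radius into a ground distance."""
    return radius_pixels * miles_per_pixel


national = cluster_threshold_miles(8, 10)  # national map at roughly 10 miles/pixel
zoomed = cluster_threshold_miles(8, 2)     # zoomed-in map at roughly 2 miles/pixel
```

Because the pixel radius stays constant while the scale changes, zooming in shrinks the ground distance at which records merge, which is why more records separate at each zoom level.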

[0072] The basic embodiment of the invention having been described, two variations will now be described. FIG. 9 is a block diagram of an embodiment of a mapping system 800 using an Internet connection. In mapping system 800, a Web browser 802 is used by the user to interact with a Web server 804 via a TCP/IP interface 806 and the Internet 808. Web server 804 provides maps overlaid with icons to Web browser 802, which displays the maps and allows the user to click on points on that map. Web browser 802 returns to Web server 804 any mouse clicks which Web server 804 would need in order to generate a new map. Web server 804 obtains the overlaid maps from a map overlay system 810 which operates substantially as described above, using a map database 812 to obtain maps and a store database 814 to obtain records about the location of stores. Of course, the records mapped need not be store records.

[0073] Initially, map overlay system 810 will request the initial map from map database 812. Typically, map database 812 is a data structure containing map data, but also includes an engine interface whereby map overlay system 810 requests maps by specifying a scale and the four edges of the desired map. As the user selects points on the map, map overlay system 810 will generate new maps according to the process described in connection with FIG. 8.

[0074] Mapping system 800 is useful where the user does not control the map data or the store data. It also allows for easy updating of store locations and other data in store database 814 because it is centralized. Mapping system 800 could also be attached to a company's Web site in such a way that the store location function appears to be part of the company's Web site, but is actually served by a map service provider. This allows for easy updates of map serving software and the underlying maps, since all of that is in one place.

[0075] In situations where the clusters and individual records cannot be displayed in a graph or map, the representation of the clusters can be performed by providing the user with summary statistics and summary graphs of the fields and/or combination of fields in each cluster. The user would use this summary information to select his/her desired cluster or record. One such example is shown in FIG. 10, where a file directory structure to be navigated is laid out. Analogous to the location maps, distances between files can be determined, so they can be clustered. The distance between two files might be the number of links to be traversed from one file to the other, or the number of links from a common ancestor of both files to the file furthest from the common ancestor.
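
The two file-distance measures mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions: files are identified by slash-separated paths, each path component counts as one link, and the "common ancestor" is taken to be the deepest directory shared by both paths. The function names and the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def _shared_depth(a, b):
    """Number of leading path components that the two paths share."""
    depth = 0
    for x, y in zip(PurePosixPath(a).parts, PurePosixPath(b).parts):
        if x != y:
            break
        depth += 1
    return depth


def links_between(a, b):
    """Links traversed going from one file up to the deepest common
    ancestor and back down to the other file."""
    k = _shared_depth(a, b)
    return (len(PurePosixPath(a).parts) - k) + (len(PurePosixPath(b).parts) - k)


def links_from_common_ancestor(a, b):
    """Links from the deepest common ancestor to whichever file is
    furthest from it."""
    k = _shared_depth(a, b)
    return max(len(PurePosixPath(a).parts) - k, len(PurePosixPath(b).parts) - k)


# Hypothetical directory layout, for illustration only.
file_a = "/windows/system/WINPRINT.DLL"
file_b = "/windows/repair/logs/setup.log"
d1 = links_between(file_a, file_b)
d2 = links_from_common_ancestor(file_a, file_b)
```

Either measure yields zero for a file compared with itself and grows as the files sit in more distant branches, so either could serve as the distance used for clustering.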

[0076] FIG. 10 shows a selection display for a hierarchical directory structure. The labels shown with cross-hatching are cluster icons for the hierarchical directory list and item icons are shown with gray labels. For example, the cross-hatched label “repair” refers to a directory of that name which contains files meeting the search criteria. The file “WINPRINT.DLL” is an individual file meeting the search criteria. If the user selects one of the cluster icons, a new and expanded display is presented showing the files and clusters underneath the selected subdirectory in more detail. FIG. 10 shows twelve icons, which are either cluster icons or item icons. As each display is generated, the substitution of a cluster icon for a plurality of individual item icons is performed until sufficiently few individual icons remain. Of course, some cases, such as a single directory having hundreds of files meeting the search criteria, cannot be effectively clustered using just a file's subdirectory as the cluster parameter.

[0077] FIG. 11 illustrates the use of clustering to optimally fill a display 900. Display 900 is a display which might be used to gain insight into a database of cars available for sale. In this particular example, if all the cars for sale were shown, display 900 would list 4911 items, which cannot be shown all at once in a visible, understandable manner. As should also be apparent, since the cars are grouped into only four categories, if the display were limited to only categories, only half of the available display space would be used. With the approach of the present invention, however, some of the items are shown by item labels and others are represented by category labels. In FIG. 11, three item labels 902 and four category labels 904 are shown (however, one of the category labels, 904a, does not represent any items, as all of its items are represented by item labels 902).

[0078] The labels 902, 904 which represent items or categories are underlined in the display to indicate to the user that the label can be selected to obtain information about the data which the label represents. Thus, selecting (via a “mouse click” on the item label or other methods) an item label 902 will result in a display of information about the selected item, whereas selecting a category label 904 will result in a display of item labels for items in that category, a request for more information, a listing of items using subcategory labels, or some combination of these.

[0079] As shown in FIG. 11, the “Electric” category is opened up to show all of its items, but none of the other categories are opened. If they were, the listing would not fit into display 900, because at least 31 more lines would be used (if the next smallest category, “Antique”, is opened). To automatically determine which categories to open, a processor, such as the CPU shown in FIG. 5 and described above, determines how much space is available after the display of the category labels, and then opens categories beginning with the smallest category, until there is no space left in the display to open another category. Alternatively, the processor might set a threshold count and tentatively open each category having a number of items equal to or less than the threshold count and then adjust the threshold count to shrink or expand the list to best fit the display area.
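
Both selection strategies described above can be sketched as follows. This is a sketch under stated assumptions rather than the disclosed implementation: the display is assumed to hold a fixed number of lines, every category label occupies one line whether or not its category is open, and each opened category adds one line per item. The 8-line capacity and the counts for the two unnamed categories are hypothetical values chosen to be consistent with the FIG. 11 example (four categories, 4911 items, "Electric" with three items, "Antique" with 31).

```python
def choose_open_categories(counts, capacity):
    """Open categories from smallest to largest while their item
    labels still fit in the lines left after the category labels."""
    free = capacity - len(counts)  # lines left after category labels
    opened = []
    for name, n in sorted(counts.items(), key=lambda kv: kv[1]):
        if n > free:
            break  # larger categories cannot fit either
        opened.append(name)
        free -= n
    return opened


def fit_threshold(counts, capacity):
    """Largest threshold count such that opening every category with
    at most that many items still fits the display (0 if none fits)."""
    best = 0
    for t in sorted(set(counts.values())):
        lines = len(counts) + sum(n for n in counts.values() if n <= t)
        if lines > capacity:
            break
        best = t
    return best


# "Electric" and "Antique" counts follow the text; the other two
# categories and the 8-line capacity are hypothetical.
car_counts = {"Electric": 3, "Antique": 31, "Category C": 1200, "Category D": 3677}
opened = choose_open_categories(car_counts, capacity=8)
threshold = fit_threshold(car_counts, capacity=8)
```

Under these assumptions the two strategies agree: only "Electric" is opened, and the fitted threshold count equals the size of that category.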

[0080] FIG. 11 also shows dynamic data fields 906. These fields are used to display information about an item or a category when a cursor 908 is moved over the item or category. If cursor 908 is over a category, data fields 906 present summary information about the items in that category, such as the average value for a field in the records for those items. If cursor 908 is over an item, data fields 906 present detailed information about the item. The detailed information can either be the specific values for the same field as used when cursor 908 is over a category or other information.
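
The behavior of the dynamic data fields might be sketched as follows. This is only an illustration: the record layout (dictionaries with "name" and "category" keys), the "price" field, the function name and the sample records are all assumptions and do not come from FIG. 11.

```python
from statistics import mean


def hover_fields(kind, label, records, field):
    """Return what the data fields would show for the label under the cursor.

    kind is "category" or "item". A category yields summary information
    (item count and average of `field`); an item yields its own value.
    """
    if kind == "category":
        values = [r[field] for r in records if r["category"] == label]
        return {"items": len(values), "average " + field: mean(values)}
    record = next(r for r in records if r["name"] == label)
    return {field: record[field]}


# Hypothetical records, for illustration only.
cars = [
    {"name": "Car A", "category": "Electric", "price": 10000},
    {"name": "Car B", "category": "Electric", "price": 20000},
    {"name": "Car C", "category": "Antique", "price": 45000},
]
summary = hover_fields("category", "Electric", cars, "price")
detail = hover_fields("item", "Car C", cars, "price")
```

Here the item view reports the same field that the category view averages, which is one of the two options the paragraph above allows.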

[0081] Using the category labels and item labels, a display area can be efficiently populated, while giving the user an indication of the overall organization of the data in the database.

[0082] The above description is illustrative and not restrictive. Many variations of the invention will become apparent to those of skill in the art upon review of this disclosure. The scope of the invention should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the appended claims along with their full scope of equivalents.

What is claimed is:
 1. A method of presenting a user with selection of items, wherein each item is characterized as being a member of one category selected from a plurality of categories, the method comprising the steps of: determining a number of categories which are to be represented in a display; determining a number of selections which can be presented at one time in the display, a selection being either an item or a category; displaying at least one item label and at least one category label, where the category label represents the items in the category, where the items and categories displayed are determined as a function of the number of categories, the number of selections and a threshold count, wherein items from categories with no more than the threshold count are represented by individual item labels and items from categories with more than the threshold count are collectively represented by one of the at least one category labels.
 2. The method of claim 1, further comprising the step of accepting user input selection of either an item label or a category label.
 3. The method of claim 2, further comprising the step of generating a regenerated display following the selection of a label, wherein the regenerated display is a display filtered according to the selected label.
 4. The method of claim 1, further comprising a step of setting the threshold count to a value which is calculated to fill the display with labels.
 5. A method of presenting a user with selection of items, wherein each item is characterized as being a member of one category selected from a plurality of categories, the method comprising the steps of: determining a number of categories which are to be represented in a display; determining a display area; determining how much of the display area would be left over area, if any, after display of category labels representing items; designating at least one category as an open category, based on the number of items in the category; displaying an item label for each item in the at least one open category and other open categories, if any, and a category label for each unopen category.
 6. The method of claim 5, wherein the step of designating at least one category as an open category is a step of designating categories as open from the smallest category to the largest category until a category is reached which will result in a display of item labels and category labels which will fill but not exceed a display area.
 7. The method of claim 5, further comprising steps of: when a cursor is over a category label, displaying summary information about the items in the category associated with the category label; and when the cursor is over an item label, displaying detailed information about the item associated with the item label.